Overview

Dataset statistics

Number of variables: 30
Number of observations: 40031
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.5 MiB
Average record size in memory: 248.0 B
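
The total size and the average record size are linked through the number of observations. A minimal sketch of that arithmetic, assuming the report uses binary units (1 MiB = 2**20 bytes) and rounds to one decimal place; the variable names are illustrative only:

```python
# Values copied from the dataset statistics above.
n_observations = 40031
avg_record_bytes = 248.0

# Assumption: "MiB" means 2**20 bytes and the report rounds to one decimal.
total_bytes = avg_record_bytes * n_observations
total_mib = total_bytes / 2**20

print(f"{total_bytes:.0f} B is about {total_mib:.1f} MiB")
```

Under these assumptions the product of the two reported figures rounds back to the reported total.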

Variable types

Numeric: 13
DateTime: 2
Categorical: 13
Unsupported: 2

Warnings

year_number has constant value "2019"  [Constant]
Product_SKU has a high cardinality: 1113 distinct values  [High cardinality]
Product_Description has a high cardinality: 403 distinct values  [High cardinality]
Transaction_ID is highly correlated with week_number and 1 other field  [High correlation]
week_number is highly correlated with Transaction_ID and 1 other field  [High correlation]
month_number is highly correlated with Transaction_ID and 1 other field  [High correlation]
Location is highly correlated with year_number  [High correlation]
Product_Category is highly correlated with year_number and 1 other field  [High correlation]
year_number is highly correlated with Location and 9 other fields  [High correlation]
Coupon_Code is highly correlated with year_number and 2 other fields  [High correlation]
User_type is highly correlated with year_number  [High correlation]
GST is highly correlated with Product_Category and 2 other fields  [High correlation]
Coupon_Status is highly correlated with year_number  [High correlation]
Discount_pct is highly correlated with year_number and 1 other field  [High correlation]
Gender is highly correlated with year_number  [High correlation]
Visit_days_average is highly correlated with year_number  [High correlation]
revenue_seg is highly correlated with year_number  [High correlation]
Quantity is highly skewed (γ1 = 20.23009436)  [Skewed]
revenue is highly skewed (γ1 = 42.29610585)  [Skewed]
Transaction_Date_Month_x is an unsupported type, check if it needs cleaning or further analysis  [Unsupported]
Transaction_Date_Month_y is an unsupported type, check if it needs cleaning or further analysis  [Unsupported]
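
The Skewed warnings report γ1 for Quantity and revenue. As a rough illustration of what such a figure measures, here is a minimal sketch of the bias-adjusted sample skewness in plain Python. It assumes the report's γ1 follows the adjusted Fisher-Pearson definition that pandas uses for `Series.skew()`; the function name and the toy data are hypothetical and are not taken from the dataset.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed to match the report's γ1)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry to measure
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# Hypothetical toy data: mostly single-unit orders plus one bulk order.
toy_quantities = [1] * 17 + [2, 2, 900]
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(symmetric))
print(sample_skewness(toy_quantities))
```

A symmetric sample gives a value at or near zero, while a single very large order pulls the statistic strongly positive, which is the pattern the warning points at.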

Reproduction

Analysis started: 2022-01-08 19:31:27.458280
Analysis finished: 2022-01-08 19:31:58.988886
Duration: 31.53 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
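
The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the timestamps are copied from the lines above and the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2022-01-08 19:31:27.458280")
finished = datetime.fromisoformat("2022-01-08 19:31:58.988886")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```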

Variables

CustomerID
Real number (ℝ≥0)

Distinct: 734
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15317.42497
Minimum: 12347
Maximum: 18283
Zeros: 0
Zeros (%): 0.0%
Memory size: 625.5 KiB

Quantile statistics

Minimum: 12347
5th percentile: 12625
Q1: 13815
Median: 15281
Q3: 16946.5
95th percentile: 17961
Maximum: 18283
Range: 5936
Interquartile range (IQR): 3131.5
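
Range and IQR follow directly from the other quantile figures. The sketch below computes the same summary for a list of values, assuming linear interpolation between order statistics (the `"inclusive"` method of `statistics.quantiles`); whether the report uses exactly this interpolation is an assumption, and the toy data are hypothetical.

```python
import statistics


def quantile_summary(values):
    """Quartiles, range and IQR using linear interpolation (assumed method)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    lo, hi = min(values), max(values)
    return {"min": lo, "q1": q1, "median": median, "q3": q3,
            "max": hi, "range": hi - lo, "iqr": q3 - q1}


# Hypothetical toy data, not taken from the dataset.
summary = quantile_summary(list(range(1, 11)))
print(summary)

# The reported CustomerID figures are internally consistent:
reported_range = 18283 - 12347
reported_iqr = 16946.5 - 13815
print(reported_range, reported_iqr)
```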

Descriptive statistics

Standard deviation: 1767.723598
Coefficient of variation (CV): 0.1154060556
Kurtosis: -1.229290173
Mean: 15317.42497
Median Absolute Deviation (MAD): 1577
Skewness: -0.007157434299
Sum: 613171839
Variance: 3124846.719
Monotonicity: Not monotonic
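
Several of these descriptive statistics are derived from one another. A short sketch that recomputes the coefficient of variation, the variance and the sum from the reported mean, standard deviation and row count; the inputs are copied from this report and the variable names are illustrative:

```python
# Figures copied from the CustomerID statistics and the dataset overview.
n = 40031
mean = 15317.42497
std = 1767.723598

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum is mean times number of observations

print(cv, variance, total)
```

Up to the rounding of the printed inputs, the three results agree with the reported CV, Variance and Sum.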
Histogram with fixed size bins (bins=50)
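
The histogram image did not survive extraction, but the caption describes fixed-width binning. A minimal sketch of that technique in plain Python, assuming equal-width bins between the minimum and maximum with the maximum placed in the last bin (the convention NumPy uses); the function name and sample data are hypothetical:

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        counts[0] = len(values)  # degenerate case: every value is identical
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum would index one past the end, so clamp it to the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


histogram = fixed_width_histogram(list(range(11)), bins=5)
print(histogram)
```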
Value               Count  Frequency (%)
12748               695    1.7%
15311               587    1.5%
14606               575    1.4%
17841               572    1.4%
14911               523    1.3%
13089               366    0.9%
15039               315    0.8%
17850               297    0.7%
14646               290    0.7%
13081               261    0.7%
Other values (724)  35550  88.8%
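
The Frequency (%) column is each count divided by the 40031 observations. The sketch below reproduces the displayed strings, assuming the report rounds to one decimal place and prints "< 0.1%" for non-zero shares that would round to 0.0%; that rule is inferred from the rows in this report rather than taken from the tool's documentation.

```python
def format_share(count, total):
    """Format count/total as a percentage string (inferred rounding rule)."""
    text = f"{100.0 * count / total:.1f}%"
    if count > 0 and text == "0.0%":
        return "< 0.1%"
    return text


n_observations = 40031
for count in (695, 35550, 23, 8):
    print(count, format_share(count, n_observations))
```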

Value  Count  Frequency (%)
12347  60     0.1%
12348  23     0.1%
12370  91     0.2%
12377  77     0.2%
12383  69     0.2%

Value  Count  Frequency (%)
18283  102    0.3%
18269  8      < 0.1%
18260  40     0.1%
18245  55     0.1%
18239  52     0.1%

Transaction_ID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 19240
Distinct (%): 48.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32218.71512
Minimum: 16679
Maximum: 48468
Zeros: 0
Zeros (%): 0.0%
Memory size: 625.5 KiB

Quantile statistics

Minimum: 16679
5th percentile: 18778.5
Q1: 25232
Median: 32405
Q3: 38731.5
95th percentile: 46379
Maximum: 48468
Range: 31789
Interquartile range (IQR): 13499.5

Descriptive statistics

Standard deviation: 8516.175972
Coefficient of variation (CV): 0.2643238857
Kurtosis: -0.996507855
Mean: 32218.71512
Median Absolute Deviation (MAD): 6795
Skewness: 0.04324610802
Sum: 1289747385
Variance: 72525253.18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
34094                 28     0.1%
38059                 27     0.1%
24820                 24     0.1%
33228                 23     0.1%
39392                 23     0.1%
34189                 22     0.1%
36082                 22     0.1%
32526                 22     0.1%
33668                 21     0.1%
36871                 21     0.1%
Other values (19230)  39798  99.4%

Value  Count  Frequency (%)
16679  1      < 0.1%
16680  1      < 0.1%
16681  1      < 0.1%
16682  10     < 0.1%
16684  2      < 0.1%

Value  Count  Frequency (%)
48468  1      < 0.1%
48467  2      < 0.1%
48466  1      < 0.1%
48465  1      < 0.1%
48464  1      < 0.1%

Distinct: 365
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB
Minimum: 2019-01-01 00:00:00
Maximum: 2019-12-31 00:00:00
Histogram with fixed size bins (bins=50)

Product_SKU
Categorical

HIGH CARDINALITY

Distinct: 1113
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

Value                Count
GGOENEBJ079499       2660
GGOENEBQ078999       2579
GGOENEBB078899       2482
GGOENEBQ079099       1037
GGOENEBQ079199       812
Other values (1108)  30461

Length

Max length: 14
Median length: 14
Mean length: 13.99980015
Min length: 12

Characters and Unicode

Total characters: 560426
Distinct characters: 34
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
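
As an illustration of the character properties mentioned here, the sketch below counts Unicode general categories for one SKU from this report using the standard library's `unicodedata` module. Script and block properties are not exposed by that module, so they are left out; the function name is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lu', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("GGOENEBJ079499")
# 'Lu' is Uppercase Letter, 'Nd' is Decimal Number.
print(counts)
```

Uppercase letters and decimal digits are the only two categories that appear, in line with the "Distinct categories" figure reported for this variable.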

Unique

Unique: 99
Unique (%): 0.2%
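
"Distinct" and "Unique" measure different things. The sketch below assumes that Distinct is the number of different values and that Unique is the number of values occurring exactly once, which is how these figures are read here; the function name and sample values are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical sample: "A" repeats, "B" and "C" appear once each.
result = distinct_and_unique(["A", "A", "B", "C"])
print(result)
```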

Sample

1st row: GGOENEBJ079499
2nd row: GGOENEBJ079499
3rd row: GGOENEBQ078999
4th row: GGOENEBQ079099
5th row: GGOENEBJ079499

Value                Count  Frequency (%)
GGOENEBJ079499       2660   6.6%
GGOENEBQ078999       2579   6.4%
GGOENEBB078899       2482   6.2%
GGOENEBQ079099       1037   2.6%
GGOENEBQ079199       812    2.0%
GGOENEBQ084699       809    2.0%
GGOENEBQ086799       614    1.5%
GGOEGFKQ020399       598    1.5%
GGOENEBQ086499       444    1.1%
GGOEGDHC018299       433    1.1%
Other values (1103)  27563  68.9%
Histogram of lengths of the category
Value                Count  Frequency (%)
ggoenebj079499       2660   6.6%
ggoenebq078999       2579   6.4%
ggoenebb078899       2482   6.2%
ggoenebq079099       1037   2.6%
ggoenebq079199       812    2.0%
ggoenebq084699       809    2.0%
ggoenebq086799       614    1.5%
ggoegfkq020399       598    1.5%
ggoenebq086499       444    1.1%
ggoegdhc018299       433    1.1%
Other values (1103)  27563  68.9%

Most occurring characters

ValueCountFrequency (%)
G103945
18.5%
965110
11.6%
E55623
9.9%
050244
 
9.0%
O43781
 
7.8%
126606
 
4.7%
A24419
 
4.4%
B22933
 
4.1%
718982
 
3.4%
818190
 
3.2%
Other values (24)130593
23.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter320236
57.1%
Decimal Number240190
42.9%

Most frequent character per category

ValueCountFrequency (%)
G103945
32.5%
E55623
17.4%
O43781
13.7%
A24419
 
7.6%
B22933
 
7.2%
N12674
 
4.0%
Q11990
 
3.7%
J8629
 
2.7%
H6110
 
1.9%
C5404
 
1.7%
Other values (14)24728
 
7.7%
ValueCountFrequency (%)
965110
27.1%
050244
20.9%
126606
11.1%
718982
 
7.9%
818190
 
7.6%
314847
 
6.2%
413100
 
5.5%
611313
 
4.7%
211154
 
4.6%
510644
 
4.4%

Most occurring scripts

ValueCountFrequency (%)
Latin320236
57.1%
Common240190
42.9%

Most frequent character per script

ValueCountFrequency (%)
G103945
32.5%
E55623
17.4%
O43781
13.7%
A24419
 
7.6%
B22933
 
7.2%
N12674
 
4.0%
Q11990
 
3.7%
J8629
 
2.7%
H6110
 
1.9%
C5404
 
1.7%
Other values (14)24728
 
7.7%
ValueCountFrequency (%)
965110
27.1%
050244
20.9%
126606
11.1%
718982
 
7.9%
818190
 
7.6%
314847
 
6.2%
413100
 
5.5%
611313
 
4.7%
211154
 
4.6%
510644
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII560426
100.0%

Most frequent character per block

ValueCountFrequency (%)
G103945
18.5%
965110
11.6%
E55623
9.9%
050244
 
9.0%
O43781
 
7.8%
126606
 
4.7%
A24419
 
4.4%
B22933
 
4.1%
718982
 
3.4%
818190
 
3.2%
Other values (24)130593
23.3%

Product_Description
Categorical

HIGH CARDINALITY

Distinct: 403
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

Value                                                   Count
Nest Learning Thermostat 3rd Gen-USA - Stainless Steel  2660
Nest Cam Outdoor Security Camera - USA                  2579
Nest Cam Indoor Security Camera - USA                   2482
Google Sunglasses                                       1185
Nest Protect Smoke + CO White Battery Alarm-USA         1037
Other values (398)                                      30088

Length

Max length: 59
Median length: 37
Mean length: 34.23591716
Min length: 8

Characters and Unicode

Total characters: 1370498
Distinct characters: 74
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: Nest Learning Thermostat 3rd Gen-USA - Stainless Steel
2nd row: Nest Learning Thermostat 3rd Gen-USA - Stainless Steel
3rd row: Nest Cam Outdoor Security Camera - USA
4th row: Nest Protect Smoke + CO White Battery Alarm-USA
5th row: Nest Learning Thermostat 3rd Gen-USA - Stainless Steel

Value                                                   Count  Frequency (%)
Nest Learning Thermostat 3rd Gen-USA - Stainless Steel  2660   6.6%
Nest Cam Outdoor Security Camera - USA                  2579   6.4%
Nest Cam Indoor Security Camera - USA                   2482   6.2%
Google Sunglasses                                       1185   3.0%
Nest Protect Smoke + CO White Battery Alarm-USA         1037   2.6%
Nest Protect Smoke + CO White Wired Alarm-USA           812    2.0%
Nest Learning Thermostat 3rd Gen-USA - White            809    2.0%
Google 22 oz Water Bottle                               679    1.7%
Nest Thermostat E - USA                                 614    1.5%
Google Laptop and Cell Phone Stickers                   598    1.5%
Other values (393)                                      26576  66.4%
Histogram of lengths of the category
ValueCountFrequency (%)
google16442
 
7.2%
13328
 
5.8%
nest12572
 
5.5%
tee8786
 
3.9%
men's7001
 
3.1%
usa6672
 
2.9%
sleeve6126
 
2.7%
cam5725
 
2.5%
short5548
 
2.4%
camera5169
 
2.3%
Other values (392)140633
61.7%

Most occurring characters

ValueCountFrequency (%)
188328
 
13.7%
e172983
 
12.6%
o99814
 
7.3%
t81161
 
5.9%
a69444
 
5.1%
r68473
 
5.0%
l61658
 
4.5%
n51003
 
3.7%
S47393
 
3.5%
s44653
 
3.3%
Other values (64)485588
35.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter892197
65.1%
Uppercase Letter241231
 
17.6%
Space Separator188328
 
13.7%
Dash Punctuation17510
 
1.3%
Decimal Number15067
 
1.1%
Other Punctuation13889
 
1.0%
Math Symbol1932
 
0.1%
Currency Symbol124
 
< 0.1%
Open Punctuation110
 
< 0.1%
Close Punctuation110
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
S47393
19.6%
G23249
9.6%
C21458
8.9%
T18833
 
7.8%
A18228
 
7.6%
N14732
 
6.1%
B13872
 
5.8%
U12575
 
5.2%
W10699
 
4.4%
M9491
 
3.9%
Other values (16)50701
21.0%
ValueCountFrequency (%)
e172983
19.4%
o99814
11.2%
t81161
9.1%
a69444
 
7.8%
r68473
 
7.7%
l61658
 
6.9%
n51003
 
5.7%
s44653
 
5.0%
i36408
 
4.1%
g31228
 
3.5%
Other values (16)175372
19.7%
ValueCountFrequency (%)
34080
27.1%
03154
20.9%
12665
17.7%
22592
17.2%
4965
 
6.4%
5698
 
4.6%
7418
 
2.8%
6292
 
1.9%
8181
 
1.2%
922
 
0.1%
ValueCountFrequency (%)
'10220
73.6%
/1443
 
10.4%
%1337
 
9.6%
&652
 
4.7%
.124
 
0.9%
;113
 
0.8%
ValueCountFrequency (%)
188328
100.0%
ValueCountFrequency (%)
-17510
100.0%
ValueCountFrequency (%)
+1932
100.0%
ValueCountFrequency (%)
(110
100.0%
ValueCountFrequency (%)
)110
100.0%
ValueCountFrequency (%)
$124
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1133428
82.7%
Common237070
 
17.3%

Most frequent character per script

ValueCountFrequency (%)
e172983
15.3%
o99814
 
8.8%
t81161
 
7.2%
a69444
 
6.1%
r68473
 
6.0%
l61658
 
5.4%
n51003
 
4.5%
S47393
 
4.2%
s44653
 
3.9%
i36408
 
3.2%
Other values (42)400438
35.3%
ValueCountFrequency (%)
188328
79.4%
-17510
 
7.4%
'10220
 
4.3%
34080
 
1.7%
03154
 
1.3%
12665
 
1.1%
22592
 
1.1%
+1932
 
0.8%
/1443
 
0.6%
%1337
 
0.6%
Other values (12)3809
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII1370498
100.0%

Most frequent character per block

ValueCountFrequency (%)
188328
 
13.7%
e172983
 
12.6%
o99814
 
7.3%
t81161
 
5.9%
a69444
 
5.1%
r68473
 
5.0%
l61658
 
4.5%
n51003
 
3.7%
S47393
 
3.5%
s44653
 
3.3%
Other values (64)485588
35.4%

Product_Category
Categorical

HIGH CORRELATION

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

Value              Count
Apparel            13696
Nest-USA           10708
Office             4872
Drinkware          2614
Lifestyle          2377
Other values (15)  5764

Length

Max length: 20
Median length: 7
Mean length: 7.383727611
Min length: 3

Characters and Unicode

Total characters: 295578
Distinct characters: 38
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Nest-USA
2nd row: Nest-USA
3rd row: Nest-USA
4th row: Nest-USA
5th row: Nest-USA

Value                 Count  Frequency (%)
Apparel               13696  34.2%
Nest-USA              10708  26.7%
Office                4872   12.2%
Drinkware             2614   6.5%
Lifestyle             2377   5.9%
Nest                  1621   4.0%
Bags                  1400   3.5%
Headgear              581    1.5%
Notebooks & Journals  568    1.4%
Waze                  422    1.1%
Other values (10)     1172   2.9%
Histogram of lengths of the category
ValueCountFrequency (%)
apparel13696
33.1%
nest-usa10708
25.9%
office4872
 
11.8%
drinkware2614
 
6.3%
lifestyle2377
 
5.8%
nest1621
 
3.9%
bags1435
 
3.5%
headgear581
 
1.4%
568
 
1.4%
journals568
 
1.4%
Other values (13)2286
 
5.5%

Most occurring characters

ValueCountFrequency (%)
e41489
14.0%
p27464
 
9.3%
A24601
 
8.3%
a20982
 
7.1%
r20517
 
6.9%
s18595
 
6.3%
l16931
 
5.7%
t16063
 
5.4%
N13140
 
4.4%
f12245
 
4.1%
Other values (28)83551
28.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter209639
70.9%
Uppercase Letter73125
 
24.7%
Dash Punctuation10951
 
3.7%
Space Separator1295
 
0.4%
Other Punctuation568
 
0.2%

Most frequent character per category

ValueCountFrequency (%)
e41489
19.8%
p27464
13.1%
a20982
10.0%
r20517
9.8%
s18595
8.9%
l16931
8.1%
t16063
 
7.7%
f12245
 
5.8%
i10184
 
4.9%
c5344
 
2.5%
Other values (10)19825
9.5%
ValueCountFrequency (%)
A24601
33.6%
N13140
18.0%
U10708
14.6%
S10708
14.6%
O4872
 
6.7%
D2614
 
3.6%
L2377
 
3.3%
B1718
 
2.3%
H669
 
0.9%
J568
 
0.8%
Other values (5)1150
 
1.6%
ValueCountFrequency (%)
-10951
100.0%
ValueCountFrequency (%)
1295
100.0%
ValueCountFrequency (%)
&568
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin282764
95.7%
Common12814
 
4.3%

Most frequent character per script

ValueCountFrequency (%)
e41489
14.7%
p27464
 
9.7%
A24601
 
8.7%
a20982
 
7.4%
r20517
 
7.3%
s18595
 
6.6%
l16931
 
6.0%
t16063
 
5.7%
N13140
 
4.6%
f12245
 
4.3%
Other values (25)70737
25.0%
ValueCountFrequency (%)
-10951
85.5%
1295
 
10.1%
&568
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII295578
100.0%

Most frequent character per block

ValueCountFrequency (%)
e41489
14.0%
p27464
 
9.3%
A24601
 
8.3%
a20982
 
7.1%
r20517
 
6.9%
s18595
 
6.3%
l16931
 
5.7%
t16063
 
5.4%
N13140
 
4.4%
f12245
 
4.1%
Other values (28)83551
28.3%

Quantity
Real number (ℝ≥0)

SKEWED

Distinct: 127
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.415378082
Minimum: 1
Maximum: 900
Zeros: 0
Zeros (%): 0.0%
Memory size: 625.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 15
Maximum: 900
Range: 899
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 20.26703266
Coefficient of variation (CV): 4.590101297
Kurtosis: 584.0241908
Mean: 4.415378082
Median Absolute Deviation (MAD): 0
Skewness: 20.23009436
Sum: 176752
Variance: 410.7526129
Monotonicity: Not monotonic
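
A MAD of 0 next to a standard deviation above 20 can look contradictory. It arises whenever more than half of the values equal the median, as is the case here, where a quantity of 1 accounts for about two thirds of the rows. A minimal sketch, assuming the unscaled definition (median of absolute deviations from the median); the function name and toy data are hypothetical:

```python
import statistics


def median_absolute_deviation(values):
    """Unscaled MAD: median of |x - median(x)| (assumed definition)."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)


# Hypothetical toy data: most orders are for one unit, one is very large.
toy_quantities = [1] * 7 + [2, 5, 900]
mad = median_absolute_deviation(toy_quantities)
spread = statistics.pstdev(toy_quantities)
print(mad, spread)
```

The large order inflates the standard deviation but leaves the MAD at zero, because the median deviation is taken over values that mostly coincide with the median.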
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   26781  66.9%
2                   5333   13.3%
3                   1756   4.4%
5                   1283   3.2%
4                   929    2.3%
10                  779    1.9%
20                  426    1.1%
6                   317    0.8%
15                  279    0.7%
25                  221    0.6%
Other values (117)  1927   4.8%

Value  Count  Frequency (%)
1      26781  66.9%
2      5333   13.3%
3      1756   4.4%
4      929    2.3%
5      1283   3.2%

Value  Count  Frequency (%)
900    1      < 0.1%
825    2      < 0.1%
791    1      < 0.1%
750    1      < 0.1%
600    3      < 0.1%

Avg_Price
Real number (ℝ≥0)

Distinct: 506
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.41615373
Minimum: 0.39
Maximum: 355.74
Zeros: 0
Zeros (%): 0.0%
Memory size: 625.5 KiB

Quantile statistics

Minimum: 0.39
5th percentile: 1.99
Q1: 5.7
Median: 16.99
Q3: 119
95th percentile: 151.88
Maximum: 355.74
Range: 355.35
Interquartile range (IQR): 113.3

Descriptive statistics

Standard deviation: 63.92781142
Coefficient of variation (CV): 1.21962042
Kurtosis: 3.306880462
Mean: 52.41615373
Median Absolute Deviation (MAD): 14.19
Skewness: 1.62114731
Sum: 2098271.05
Variance: 4086.765073
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
119                 4041   10.1%
149                 2982   7.4%
79                  1509   3.8%
13.59               1160   2.9%
2.99                970    2.4%
2.39                955    2.4%
16.99               843    2.1%
15.19               771    1.9%
1.99                745    1.9%
3.99                736    1.8%
Other values (496)  25319  63.2%

Value  Count  Frequency (%)
0.39   1      < 0.1%
0.4    35     0.1%
0.41   11     < 0.1%
0.5    26     0.1%
0.51   20     < 0.1%

Value   Count  Frequency (%)
355.74  129    0.3%
349     245    0.6%
279     109    0.3%
274.19  1      < 0.1%
269     1      < 0.1%

Delivery_Charges
Real number (ℝ≥0)

Distinct: 241
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.44624541
Minimum: 0
Maximum: 521.36
Zeros: 127
Zeros (%): 0.3%
Memory size: 625.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 6
Q1: 6
Median: 6
Q3: 6.5
95th percentile: 26.43
Maximum: 521.36
Range: 521.36
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 18.43257371
Coefficient of variation (CV): 1.764516627
Kurtosis: 220.1261797
Mean: 10.44624541
Median Absolute Deviation (MAD): 0
Skewness: 11.89477448
Sum: 418173.65
Variance: 339.7597736
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
6                   20363  50.9%
6.5                 11923  29.8%
12.99               1949   4.9%
19.99               747    1.9%
12.48               555    1.4%
12.91               328    0.8%
8.7                 250    0.6%
0                   127    0.3%
18.47               111    0.3%
75                  87     0.2%
Other values (231)  3591   9.0%

Value  Count  Frequency (%)
0      127    0.3%
6      20363  50.9%
6.46   9      < 0.1%
6.48   21     0.1%
6.5    11923  29.8%

Value   Count  Frequency (%)
521.36  1      < 0.1%
492.84  10     < 0.1%
422.24  4      < 0.1%
354     3      < 0.1%
323.47  4      < 0.1%

Coupon_Status
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

Value     Count
Clicked   20359
Used      13572
Not Used  6100

Length

Max length: 8
Median length: 7
Mean length: 6.135270166
Min length: 4

Characters and Unicode

Total characters: 245601
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Used
2nd row: Used
3rd row: Not Used
4th row: Clicked
5th row: Clicked

Value     Count  Frequency (%)
Clicked   20359  50.9%
Used      13572  33.9%
Not Used  6100   15.2%

Histogram of lengths of the category

Value    Count  Frequency (%)
clicked  20359  44.1%
used     19672  42.6%
not      6100   13.2%
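
The lowercase table appears to count words rather than whole values: "used" is credited with the rows for both "Used" and "Not Used". A sketch that reproduces those figures from the value counts above, assuming values are lowercased and split on whitespace; that tokenisation is inferred from the numbers, not confirmed by the report:

```python
from collections import Counter

# Value counts copied from the Coupon_Status table.
value_counts = {"Clicked": 20359, "Used": 13572, "Not Used": 6100}

# Assumed tokenisation: lowercase each value and split on whitespace.
word_counts = Counter()
for value, count in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
shares = {w: round(100 * c / total_words, 1) for w, c in word_counts.items()}
print(word_counts, shares)
```

The percentages are taken over the total number of words rather than the number of rows, which is why they differ from the value-level table.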

Most occurring characters

ValueCountFrequency (%)
e40031
16.3%
d40031
16.3%
C20359
8.3%
l20359
8.3%
i20359
8.3%
c20359
8.3%
k20359
8.3%
U19672
8.0%
s19672
8.0%
N6100
 
2.5%
Other values (3)18300
7.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter193370
78.7%
Uppercase Letter46131
 
18.8%
Space Separator6100
 
2.5%

Most frequent character per category

ValueCountFrequency (%)
e40031
20.7%
d40031
20.7%
l20359
10.5%
i20359
10.5%
c20359
10.5%
k20359
10.5%
s19672
10.2%
o6100
 
3.2%
t6100
 
3.2%
ValueCountFrequency (%)
C20359
44.1%
U19672
42.6%
N6100
 
13.2%
ValueCountFrequency (%)
6100
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin239501
97.5%
Common6100
 
2.5%

Most frequent character per script

ValueCountFrequency (%)
e40031
16.7%
d40031
16.7%
C20359
8.5%
l20359
8.5%
i20359
8.5%
c20359
8.5%
k20359
8.5%
U19672
8.2%
s19672
8.2%
N6100
 
2.5%
Other values (2)12200
 
5.1%
ValueCountFrequency (%)
6100
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII245601
100.0%

Most frequent character per block

ValueCountFrequency (%)
e40031
16.3%
d40031
16.3%
C20359
8.3%
l20359
8.3%
i20359
8.3%
c20359
8.3%
k20359
8.3%
U19672
8.0%
s19672
8.0%
N6100
 
2.5%
Other values (3)18300
7.5%

Gender
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

Value  Count
F      24966
M      15065

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 40031
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: M
4th row: M
5th row: M

Value  Count  Frequency (%)
F      24966  62.4%
M      15065  37.6%

Histogram of lengths of the category

Value  Count  Frequency (%)
f      24966  62.4%
m      15065  37.6%

Most occurring characters

ValueCountFrequency (%)
F24966
62.4%
M15065
37.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter40031
100.0%

Most frequent character per category

ValueCountFrequency (%)
F24966
62.4%
M15065
37.6%

Most occurring scripts

ValueCountFrequency (%)
Latin40031
100.0%

Most frequent character per script

ValueCountFrequency (%)
F24966
62.4%
M15065
37.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII40031
100.0%

Most frequent character per block

ValueCountFrequency (%)
F24966
62.4%
M15065
37.6%

Location
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

Value          Count
Chicago        14584
California     11805
New York       8374
New Jersey     3182
Washington DC  2086

Length

Max length: 13
Median length: 8
Mean length: 8.645000125
Min length: 7

Characters and Unicode

Total characters: 346068
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Chicago
2nd row: Chicago
3rd row: Chicago
4th row: Chicago
5th row: Chicago

Value          Count  Frequency (%)
Chicago        14584  36.4%
California     11805  29.5%
New York       8374   20.9%
New Jersey     3182   7.9%
Washington DC  2086   5.2%

Histogram of lengths of the category

Value       Count  Frequency (%)
chicago     14584  27.2%
california  11805  22.0%
new         11556  21.5%
york        8374   15.6%
jersey      3182   5.9%
dc          2086   3.9%
washington  2086   3.9%

Most occurring characters

ValueCountFrequency (%)
i40280
11.6%
a40280
11.6%
o36849
 
10.6%
C28475
 
8.2%
r23361
 
6.8%
e17920
 
5.2%
h16670
 
4.8%
g16670
 
4.8%
n15977
 
4.6%
c14584
 
4.2%
Other values (13)95002
27.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter276667
79.9%
Uppercase Letter55759
 
16.1%
Space Separator13642
 
3.9%

Most frequent character per category

ValueCountFrequency (%)
i40280
14.6%
a40280
14.6%
o36849
13.3%
r23361
8.4%
e17920
6.5%
h16670
 
6.0%
g16670
 
6.0%
n15977
 
5.8%
c14584
 
5.3%
l11805
 
4.3%
Other values (6)42271
15.3%
ValueCountFrequency (%)
C28475
51.1%
N11556
20.7%
Y8374
 
15.0%
J3182
 
5.7%
W2086
 
3.7%
D2086
 
3.7%
ValueCountFrequency (%)
13642
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin332426
96.1%
Common13642
 
3.9%

Most frequent character per script

ValueCountFrequency (%)
i40280
12.1%
a40280
12.1%
o36849
11.1%
C28475
 
8.6%
r23361
 
7.0%
e17920
 
5.4%
h16670
 
5.0%
g16670
 
5.0%
n15977
 
4.8%
c14584
 
4.4%
Other values (12)81360
24.5%
ValueCountFrequency (%)
13642
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII346068
100.0%

Most frequent character per block

ValueCountFrequency (%)
i40280
11.6%
a40280
11.6%
o36849
 
10.6%
C28475
 
8.2%
r23361
 
6.8%
e17920
 
5.2%
h16670
 
4.8%
g16670
 
4.8%
n15977
 
4.6%
c14584
 
4.2%
Other values (13)95002
27.5%

Tenure_Months
Real number (ℝ≥0)

Distinct: 49
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.9437436
Minimum: 2
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Memory size: 625.5 KiB

Quantile statistics

Minimum: 2
5th percentile: 5
Q1: 14
Median: 27
Q3: 37
95th percentile: 46
Maximum: 50
Range: 48
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 13.33459876
Coefficient of variation (CV): 0.5139812884
Kurtosis: -1.095924454
Mean: 25.9437436
Median Absolute Deviation (MAD): 11
Skewness: -0.0922319672
Sum: 1038554
Variance: 177.8115241
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=49)
Value              Count  Frequency (%)
40                 1653   4.1%
25                 1632   4.1%
30                 1493   3.7%
34                 1434   3.6%
5                  1421   3.5%
33                 1410   3.5%
21                 1263   3.2%
28                 1242   3.1%
45                 1176   2.9%
10                 1146   2.9%
Other values (39)  26161  65.4%

Value  Count  Frequency (%)
2      467    1.2%
3      561    1.4%
4      706    1.8%
5      1421   3.5%
6      1034   2.6%

Value  Count  Frequency (%)
50     443    1.1%
49     412    1.0%
48     633    1.6%
47     349    0.9%
46     610    1.5%

GST
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

Value  Count
0.18   20667
0.1    16171
0.05   3105
0.12   88

Length

Max length: 4
Median length: 4
Mean length: 3.59603807
Min length: 3

Characters and Unicode

Total characters: 143953
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.1
2nd row: 0.1
3rd row: 0.1
4th row: 0.1
5th row: 0.1

Value  Count  Frequency (%)
0.18   20667  51.6%
0.1    16171  40.4%
0.05   3105   7.8%
0.12   88     0.2%

Histogram of lengths of the category

Value  Count  Frequency (%)
0.18   20667  51.6%
0.1    16171  40.4%
0.05   3105   7.8%
0.12   88     0.2%

Most occurring characters

ValueCountFrequency (%)
043136
30.0%
.40031
27.8%
136926
25.7%
820667
14.4%
53105
 
2.2%
288
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number103922
72.2%
Other Punctuation40031
 
27.8%

Most frequent character per category

ValueCountFrequency (%)
043136
41.5%
136926
35.5%
820667
19.9%
53105
 
3.0%
288
 
0.1%
ValueCountFrequency (%)
.40031
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common143953
100.0%

Most frequent character per script

ValueCountFrequency (%)
043136
30.0%
.40031
27.8%
136926
25.7%
820667
14.4%
53105
 
2.2%
288
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII143953
100.0%

Most frequent character per block

ValueCountFrequency (%)
043136
30.0%
.40031
27.8%
136926
25.7%
820667
14.4%
53105
 
2.2%
288
 
0.1%

Transaction_Date_Month_x
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

Transaction_Date_Month_y
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

User_type
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

Value     Count
Existing  20891
New       19140

Length

Max length: 8
Median length: 8
Mean length: 5.609352752
Min length: 3

Characters and Unicode

Total characters: 224548
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: New
2nd row: New
3rd row: New
4th row: New
5th row: New

Value     Count  Frequency (%)
Existing  20891  52.2%
New       19140  47.8%

Histogram of lengths of the category

Value     Count  Frequency (%)
existing  20891  52.2%
new       19140  47.8%

Most occurring characters

ValueCountFrequency (%)
i41782
18.6%
E20891
9.3%
x20891
9.3%
s20891
9.3%
t20891
9.3%
n20891
9.3%
g20891
9.3%
N19140
8.5%
e19140
8.5%
w19140
8.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter184517
82.2%
Uppercase Letter40031
 
17.8%

Most frequent character per category

ValueCountFrequency (%)
i41782
22.6%
x20891
11.3%
s20891
11.3%
t20891
11.3%
n20891
11.3%
g20891
11.3%
e19140
10.4%
w19140
10.4%
ValueCountFrequency (%)
E20891
52.2%
N19140
47.8%

Most occurring scripts

ValueCountFrequency (%)
Latin224548
100.0%

Most frequent character per script

ValueCountFrequency (%)
i41782
18.6%
E20891
9.3%
x20891
9.3%
s20891
9.3%
t20891
9.3%
n20891
9.3%
g20891
9.3%
N19140
8.5%
e19140
8.5%
w19140
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII224548
100.0%

Most frequent character per block

ValueCountFrequency (%)
i41782
18.6%
E20891
9.3%
x20891
9.3%
s20891
9.3%
t20891
9.3%
n20891
9.3%
g20891
9.3%
N19140
8.5%
e19140
8.5%
w19140
8.5%

year_number
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 625.5 KiB

Value  Count
2019   40031

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 160124
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Value  Count  Frequency (%)
2019   40031  100.0%

Histogram of lengths of the category

Value  Count  Frequency (%)
2019   40031  100.0%

Most occurring characters

ValueCountFrequency (%)
240031
25.0%
040031
25.0%
140031
25.0%
940031
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number160124
100.0%

Most frequent character per category

ValueCountFrequency (%)
240031
25.0%
040031
25.0%
140031
25.0%
940031
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common160124
100.0%

Most frequent character per script

ValueCountFrequency (%)
240031
25.0%
040031
25.0%
140031
25.0%
940031
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII160124
100.0%

Most frequent character per block

ValueCountFrequency (%)
240031
25.0%
040031
25.0%
140031
25.0%
940031
25.0%

week_number
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 52
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.83402863
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Memory size: 625.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 4
Q1: 15
Median: 28
Q3: 38
95th percentile: 50
Maximum: 52
Range: 51
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 14.32169504
Coefficient of variation (CV): 0.5337139362
Kurtosis: -1.077566579
Mean: 26.83402863
Median Absolute Deviation (MAD): 12
Skewness: -0.03034840189
Sum: 1074193
Variance: 205.1109489
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
29                  1234  3.1%
31                  1192  3.0%
28                  1076  2.7%
34                  1046  2.6%
33                  1039  2.6%
35                  1027  2.6%
30                  1026  2.6%
32                   997  2.5%
49                   961  2.4%
50                   930  2.3%
Other values (42)  29503  73.7%

Value  Count  Frequency (%)
1        631  1.6%
2        585  1.5%
3        637  1.6%
4        632  1.6%
5        568  1.4%

Value  Count  Frequency (%)
52       428  1.1%
51       772  1.9%
50       930  2.3%
49       961  2.4%
48       670  1.7%
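
The summary figures reported for week_number follow from one another, which makes them easy to sanity-check by hand. The sketch below recomputes the mean, range, IQR, coefficient of variation and variance from the other printed values. The helper name `derived_stats` is hypothetical and is not part of the profiling tool; the inputs are copied from the week_number section.

```python
def derived_stats(total, count, std, minimum, maximum, q1, q3):
    """Recompute statistics that follow from other reported figures."""
    mean = total / count
    return {
        "mean": mean,
        "range": maximum - minimum,
        "iqr": q3 - q1,            # interquartile range
        "cv": std / mean,          # coefficient of variation
        "variance": std ** 2,
    }

# Inputs copied from the week_number section of the report.
stats = derived_stats(total=1074193, count=40031, std=14.32169504,
                      minimum=1, maximum=52, q1=15, q3=38)

for name, value in stats.items():
    print(f"{name}: {value:.6f}")
```

The recomputed values should agree with the report up to the rounding of the printed inputs.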

month_number
Real number (ℝ≥0)

HIGH CORRELATION

Distinct12
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.576078539
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Memory size625.5 KiB

Quantile statistics

Minimum1
5-th percentile1
Q14
median7
Q39
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)5

Descriptive statistics

Standard deviation3.280530431
Coefficient of variation (CV)0.4988581587
Kurtosis-1.06641822
Mean6.576078539
Median Absolute Deviation (MAD)3
Skewness-0.0295538037
Sum263247
Variance10.76187991
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
Value             Count  Frequency (%)
8                  4654  11.6%
7                  4567  11.4%
5                  3467  8.7%
3                  3394  8.5%
4                  3264  8.2%
12                 3182  7.9%
10                 3116  7.8%
6                  3081  7.7%
9                  2991  7.5%
2                  2819  7.0%
Other values (2)   5496  13.7%

Value  Count  Frequency (%)
1       2737  6.8%
2       2819  7.0%
3       3394  8.5%
4       3264  8.2%
5       3467  8.7%

Value  Count  Frequency (%)
12      3182  7.9%
11      2759  6.9%
10      3116  7.8%
9       2991  7.5%
8       4654  11.6%

Coupon_Code
Categorical

HIGH CORRELATION

Distinct46
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size625.5 KiB
SALE20
4828 
SALE10
4513 
SALE30
4355 
ELEC10
3751 
ELEC20
3511 
Other values (41)
19073 

Length

Max length9
Median length6
Mean length5.866378557
Min length4

Characters and Unicode

Total characters234837
Distinct characters30
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowELEC10
2nd rowELEC10
3rd rowELEC10
4th rowELEC10
5th rowELEC10
Value              Count  Frequency (%)
SALE20              4828  12.1%
SALE10              4513  11.3%
SALE30              4355  10.9%
ELEC10              3751  9.4%
ELEC20              3511  8.8%
ELEC30              3446  8.6%
EXTRA10             1814  4.5%
OFF10               1746  4.4%
EXTRA20             1725  4.3%
OFF20               1658  4.1%
Other values (36)   8684  21.7%
Histogram of lengths of the category
ValueCountFrequency (%)
sale204828
12.0%
sale104513
11.2%
sale304355
10.8%
elec103751
9.3%
elec203511
 
8.7%
elec303446
 
8.5%
extra101814
 
4.5%
off101746
 
4.3%
extra201725
 
4.3%
off201658
 
4.1%
Other values (37)8993
22.3%

Most occurring characters

ValueCountFrequency (%)
E42727
18.2%
039722
16.9%
L24404
10.4%
A21108
9.0%
S13696
 
5.8%
213596
 
5.8%
113559
 
5.8%
312567
 
5.4%
C11712
 
5.0%
F9744
 
4.1%
Other values (20)32002
13.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter153230
65.2%
Decimal Number79444
33.8%
Lowercase Letter1854
 
0.8%
Space Separator309
 
0.1%

Most frequent character per category

ValueCountFrequency (%)
E42727
27.9%
L24404
15.9%
A21108
13.8%
S13696
 
8.9%
C11712
 
7.6%
F9744
 
6.4%
O6360
 
4.2%
R5572
 
3.6%
T5202
 
3.4%
X4991
 
3.3%
Other values (11)7714
 
5.0%
ValueCountFrequency (%)
039722
50.0%
213596
 
17.1%
113559
 
17.1%
312567
 
15.8%
ValueCountFrequency (%)
o927
50.0%
u309
 
16.7%
p309
 
16.7%
n309
 
16.7%
ValueCountFrequency (%)
309
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin155084
66.0%
Common79753
34.0%

Most frequent character per script

ValueCountFrequency (%)
E42727
27.6%
L24404
15.7%
A21108
13.6%
S13696
 
8.8%
C11712
 
7.6%
F9744
 
6.3%
O6360
 
4.1%
R5572
 
3.6%
T5202
 
3.4%
X4991
 
3.2%
Other values (15)9568
 
6.2%
ValueCountFrequency (%)
039722
49.8%
213596
 
17.0%
113559
 
17.0%
312567
 
15.8%
309
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII234837
100.0%

Most frequent character per block

ValueCountFrequency (%)
E42727
18.2%
039722
16.9%
L24404
10.4%
A21108
9.0%
S13696
 
5.8%
213596
 
5.8%
113559
 
5.8%
312567
 
5.4%
C11712
 
5.0%
F9744
 
4.1%
Other values (20)32002
13.6%

Discount_pct
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size625.5 KiB
20.0
13596 
10.0
13559 
30.0
12567 
0.0
 
309

Length

Max length4
Median length4
Mean length3.992280982
Min length3

Characters and Unicode

Total characters159815
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row10.0
2nd row10.0
3rd row10.0
4th row10.0
5th row10.0
Value  Count  Frequency (%)
20.0   13596  34.0%
10.0   13559  33.9%
30.0   12567  31.4%
0.0      309  0.8%
Histogram of lengths of the category
ValueCountFrequency (%)
20.013596
34.0%
10.013559
33.9%
30.012567
31.4%
0.0309
 
0.8%

Most occurring characters

ValueCountFrequency (%)
080062
50.1%
.40031
25.0%
213596
 
8.5%
113559
 
8.5%
312567
 
7.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number119784
75.0%
Other Punctuation40031
 
25.0%

Most frequent character per category

ValueCountFrequency (%)
080062
66.8%
213596
 
11.4%
113559
 
11.3%
312567
 
10.5%
ValueCountFrequency (%)
.40031
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common159815
100.0%

Most frequent character per script

ValueCountFrequency (%)
080062
50.1%
.40031
25.0%
213596
 
8.5%
113559
 
8.5%
312567
 
7.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII159815
100.0%

Most frequent character per block

ValueCountFrequency (%)
080062
50.1%
.40031
25.0%
213596
 
8.5%
113559
 
8.5%
312567
 
7.9%

Marketing_Date

Distinct365
Distinct (%)0.9%
Missing0
Missing (%)0.0%
Memory size625.5 KiB
Minimum2019-01-01 00:00:00
Maximum2019-12-31 00:00:00
Histogram with fixed size bins (bins=50)

Offline_Spend
Real number (ℝ≥0)

Distinct11
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2812.093128
Minimum500
Maximum5000
Zeros0
Zeros (%)0.0%
Memory size625.5 KiB

Quantile statistics

Minimum500
5-th percentile1000
Q12500
median3000
Q33500
95-th percentile4000
Maximum5000
Range4500
Interquartile range (IQR)1000

Descriptive statistics

Standard deviation927.8449336
Coefficient of variation (CV)0.329948153
Kurtosis0.09180464913
Mean2812.093128
Median Absolute Deviation (MAD)500
Skewness-0.3114640595
Sum112570900
Variance860896.2208
MonotonicityNot monotonic
Histogram with fixed size bins (bins=11)
Value  Count  Frequency (%)
3000   10027  25.0%
2500    7699  19.2%
3500    6955  17.4%
2000    5038  12.6%
4000    3680  9.2%
1500    1882  4.7%
1000    1377  3.4%
4500    1354  3.4%
500      826  2.1%
700      597  1.5%

Value  Count  Frequency (%)
500      826  2.1%
700      597  1.5%
1000    1377  3.4%
1500    1882  4.7%
2000    5038  12.6%

Value  Count  Frequency (%)
5000     596  1.5%
4500    1354  3.4%
4000    3680  9.2%
3500    6955  17.4%
3000   10027  25.0%

Online_Spend
Real number (ℝ≥0)

Distinct365
Distinct (%)0.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1873.168574
Minimum320.25
Maximum4556.93
Zeros0
Zeros (%)0.0%
Memory size625.5 KiB

Quantile statistics

Minimum320.25
5-th percentile687.86
Q11196.03
median1801.66
Q32424.5
95-th percentile3396.14
Maximum4556.93
Range4236.68
Interquartile range (IQR)1228.47

Descriptive statistics

Standard deviation812.5039497
Coefficient of variation (CV)0.4337591187
Kurtosis-0.2152680107
Mean1873.168574
Median Absolute Deviation (MAD)609.73
Skewness0.467008992
Sum74984811.18
Variance660162.6683
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1692.8                294  0.7%
2819.58               276  0.7%
985.28                260  0.6%
1331.1                258  0.6%
1108.88               241  0.6%
1542.81               224  0.6%
1946.56               218  0.5%
1128.09               218  0.5%
2489.36               216  0.5%
1901.56               215  0.5%
Other values (355)  37611  94.0%

Value   Count  Frequency (%)
320.25    130  0.3%
417.73    132  0.3%
465.4      43  0.1%
478.27    131  0.3%
484.9     113  0.3%

Value    Count  Frequency (%)
4556.93     83  0.2%
4349.02     74  0.2%
4055.3     110  0.3%
4019.93     61  0.2%
3897.2      90  0.2%

revenue
Real number (ℝ≥0)

SKEWED

Distinct8077
Distinct (%)20.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean192.2484246
Minimum4.12
Maximum115686
Zeros0
Zeros (%)0.0%
Memory size625.5 KiB

Quantile statistics

Minimum4.12
5-th percentile8.8
Q119.59
median51.54
Q3155
95-th percentile427.75
Maximum115686
Range115681.88
Interquartile range (IQR)135.41

Descriptive statistics

Standard deviation1450.441573
Coefficient of variation (CV)7.544621372
Kurtosis2484.930398
Mean192.2484246
Median Absolute Deviation (MAD)40.66
Skewness42.29610585
Sum7695896.685
Variance2103780.756
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
125                   1171  2.9%
155                    971  2.4%
125.5                  508  1.3%
250                    411  1.0%
19.59                  397  1.0%
155.5                  388  1.0%
85                     279  0.7%
16.63                  266  0.7%
21.99                  255  0.6%
22.99                  235  0.6%
Other values (8067)  35150  87.8%

Value  Count  Frequency (%)
4.12       1  < 0.1%
4.185      1  < 0.1%
4.557      6  < 0.1%
4.65       1  < 0.1%
4.753      2  < 0.1%

Value      Count  Frequency (%)
115686         1  < 0.1%
109452.83      1  < 0.1%
78582.08       1  < 0.1%
65250          1  < 0.1%
60154.884      1  < 0.1%
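
The report flags revenue as skewed (γ1 = 42.29610585), which matches the gap between its median (51.54) and mean (192.2484246). As a rough illustration of what the coefficient measures, the sketch below computes the plain moment-based skewness, m3 / m2^1.5. This is an assumption about the general idea only: the profiling tool may apply a small-sample bias correction, so its figure need not match this formula exactly. The toy values are made up and are not rows from the dataset.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5

# Made-up toy data: one symmetric sample, one with a single huge value.
symmetric = [1, 2, 3, 4, 5]
long_tail = [5, 8, 20, 50, 125, 155, 115686]

print(skewness(symmetric))   # symmetric data has zero skewness
print(skewness(long_tail))   # a long right tail gives a positive value
```

A single extreme value is enough to pull the coefficient strongly positive, which is the pattern the revenue maximum (115686) suggests.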

revenue_per_customer
Real number (ℝ≥0)

Distinct734
Distinct (%)1.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean27753.46303
Minimum28.863
Maximum257792.206
Zeros0
Zeros (%)0.0%
Memory size625.5 KiB

Quantile statistics

Minimum28.863
5-th percentile1853.213
Q15094.736
median10783.312
Q324715.743
95-th percentile118511.709
Maximum257792.206
Range257763.343
Interquartile range (IQR)19621.007

Descriptive statistics

Standard deviation46116.52219
Coefficient of variation (CV)1.661649292
Kurtosis10.84155146
Mean27753.46303
Median Absolute Deviation (MAD)6815.056
Skewness3.170178368
Sum1110998879
Variance2126733619
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
244509.716            695  1.7%
106406.408            587  1.5%
118511.709            575  1.4%
113573.419            572  1.4%
134516.474            523  1.3%
36400.311             366  0.9%
53675.804             315  0.8%
60381.669             297  0.7%
25601.119             290  0.7%
36940.336             261  0.7%
Other values (724)  35550  88.8%

Value   Count  Frequency (%)
28.863      2  < 0.1%
62.22       2  < 0.1%
63.296      4  < 0.1%
71.18       2  < 0.1%
82.192      2  < 0.1%

Value       Count  Frequency (%)
257792.206    147  0.4%
245611.693    157  0.4%
244509.716    695  1.7%
137971.419    163  0.4%
134516.474    523  1.3%

revenue_seg
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size625.5 KiB
High_value
16470 
Medium_value
13899 
Low_value
9662 

Length

Max length12
Median length10
Mean length10.45304889
Min length9

Characters and Unicode

Total characters418446
Distinct characters16
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowHigh_value
2nd rowHigh_value
3rd rowHigh_value
4th rowHigh_value
5th rowHigh_value
Value         Count  Frequency (%)
High_value    16470  41.1%
Medium_value  13899  34.7%
Low_value      9662  24.1%
Histogram of lengths of the category
ValueCountFrequency (%)
high_value16470
41.1%
medium_value13899
34.7%
low_value9662
24.1%

Most occurring characters

ValueCountFrequency (%)
u53930
12.9%
e53930
12.9%
_40031
9.6%
v40031
9.6%
a40031
9.6%
l40031
9.6%
i30369
7.3%
H16470
 
3.9%
g16470
 
3.9%
h16470
 
3.9%
Other values (6)70683
16.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter338384
80.9%
Uppercase Letter40031
 
9.6%
Connector Punctuation40031
 
9.6%

Most frequent character per category

ValueCountFrequency (%)
u53930
15.9%
e53930
15.9%
v40031
11.8%
a40031
11.8%
l40031
11.8%
i30369
9.0%
g16470
 
4.9%
h16470
 
4.9%
d13899
 
4.1%
m13899
 
4.1%
Other values (2)19324
 
5.7%
ValueCountFrequency (%)
H16470
41.1%
M13899
34.7%
L9662
24.1%
ValueCountFrequency (%)
_40031
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin378415
90.4%
Common40031
 
9.6%

Most frequent character per script

ValueCountFrequency (%)
u53930
14.3%
e53930
14.3%
v40031
10.6%
a40031
10.6%
l40031
10.6%
i30369
8.0%
H16470
 
4.4%
g16470
 
4.4%
h16470
 
4.4%
M13899
 
3.7%
Other values (5)56784
15.0%
ValueCountFrequency (%)
_40031
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII418446
100.0%

Most frequent character per block

ValueCountFrequency (%)
u53930
12.9%
e53930
12.9%
_40031
9.6%
v40031
9.6%
a40031
9.6%
l40031
9.6%
i30369
7.3%
H16470
 
3.9%
g16470
 
3.9%
h16470
 
3.9%
Other values (6)70683
16.9%

days_bw_visits
Real number (ℝ≥0)

Distinct382
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean54.69418114
Minimum1
Maximum351
Zeros0
Zeros (%)0.0%
Memory size625.5 KiB

Quantile statistics

Minimum1
5-th percentile1
Q115.26086957
median43.66666667
Q374
95-th percentile158
Maximum351
Range350
Interquartile range (IQR)58.73913043

Descriptive statistics

Standard deviation52.57700312
Coefficient of variation (CV)0.9612906167
Kurtosis4.667905908
Mean54.69418114
Median Absolute Deviation (MAD)28.4057971
Skewness1.804998284
Sum2189462.765
Variance2764.341257
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                    5490  13.7%
7.575757576           695  1.7%
15.26086957           587  1.5%
16                    584  1.5%
13.42307692           575  1.4%
17.73684211           572  1.4%
13.76                 523  1.3%
40.6                  376  0.9%
43                    373  0.9%
50                    354  0.9%
Other values (372)  29902  74.7%

Value        Count  Frequency (%)
1             5490  13.7%
1.5             29  0.1%
2              133  0.3%
2.5             45  0.1%
2.777777778    297  0.7%

Value  Count  Frequency (%)
351       62  0.2%
344       34  0.1%
332       17  < 0.1%
325       19  < 0.1%
307       17  < 0.1%

Visit_days_average
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size625.5 KiB
Less than 30 days visit
14239 
More than 60 days visit
13939 
More than 30 & Less than 60 days visit
11853 

Length

Max length38
Median length23
Mean length27.44143289
Min length23

Characters and Unicode

Total characters1098508
Distinct characters19
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowLess than 30 days visit
2nd rowLess than 30 days visit
3rd rowLess than 30 days visit
4th rowLess than 30 days visit
5th rowLess than 30 days visit
Value                                   Count  Frequency (%)
Less than 30 days visit                 14239  35.6%
More than 60 days visit                 13939  34.8%
More than 30 & Less than 60 days visit  11853  29.6%
Histogram of lengths of the category
ValueCountFrequency (%)
than51884
21.0%
days40031
16.2%
visit40031
16.2%
less26092
10.5%
3026092
10.5%
more25792
10.4%
6025792
10.4%
11853
 
4.8%

Most occurring characters

ValueCountFrequency (%)
207536
18.9%
s132246
12.0%
t91915
 
8.4%
a91915
 
8.4%
i80062
 
7.3%
e51884
 
4.7%
h51884
 
4.7%
n51884
 
4.7%
051884
 
4.7%
d40031
 
3.6%
Other values (9)247267
22.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter723467
65.9%
Space Separator207536
 
18.9%
Decimal Number103768
 
9.4%
Uppercase Letter51884
 
4.7%
Other Punctuation11853
 
1.1%

Most frequent character per category

ValueCountFrequency (%)
s132246
18.3%
t91915
12.7%
a91915
12.7%
i80062
11.1%
e51884
 
7.2%
h51884
 
7.2%
n51884
 
7.2%
d40031
 
5.5%
y40031
 
5.5%
v40031
 
5.5%
Other values (2)51584
 
7.1%
ValueCountFrequency (%)
051884
50.0%
326092
25.1%
625792
24.9%
ValueCountFrequency (%)
L26092
50.3%
M25792
49.7%
ValueCountFrequency (%)
207536
100.0%
ValueCountFrequency (%)
&11853
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin775351
70.6%
Common323157
29.4%

Most frequent character per script

ValueCountFrequency (%)
s132246
17.1%
t91915
11.9%
a91915
11.9%
i80062
10.3%
e51884
 
6.7%
h51884
 
6.7%
n51884
 
6.7%
d40031
 
5.2%
y40031
 
5.2%
v40031
 
5.2%
Other values (4)103468
13.3%
ValueCountFrequency (%)
207536
64.2%
051884
 
16.1%
326092
 
8.1%
625792
 
8.0%
&11853
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII1098508
100.0%

Most frequent character per block

ValueCountFrequency (%)
207536
18.9%
s132246
12.0%
t91915
 
8.4%
a91915
 
8.4%
i80062
 
7.3%
e51884
 
4.7%
h51884
 
4.7%
n51884
 
4.7%
051884
 
4.7%
d40031
 
3.6%
Other values (9)247267
22.5%

Interactions

[Interaction plots: 156 figures, not reproduced.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for perfectly linear data the slope of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
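
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the routine the report itself runs; the function name `pearson_r` and the toy data are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)

# Made-up toy data, not taken from the report.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shifting x and rescaling it by a positive factor leaves r unchanged.
r_shifted = pearson_r([10 * v + 3 for v in x], y)
print(round(r, 4), round(r_shifted, 4))
```

The guard for a zero standard deviation matters for a column such as year_number, which the report flags as constant: the ratio is undefined there.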

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
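
As a sketch of that recipe, the code below ranks both variables (ties receive the average of the ranks they span) and then applies the covariance-over-standard-deviations formula to the ranks. The helper names and the toy data are made up for illustration; this is not the report's own implementation.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Made-up toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because the cubic relationship is strictly increasing, the ranks of x and y coincide and ρ reaches its maximum even though the relationship is not linear.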

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
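
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition; it is an illustration under that assumption, and statistical libraries often use a tie-adjusted variant instead, so results can differ when the data contain ties. The function name and toy data are made up.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Made-up toy data: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, giving (5 - 1) / 6.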

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma (2013).
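
As a rough illustration, the sketch below computes a bias-corrected Cramér's V from two lists of category labels using only the standard library. The correction terms follow the commonly cited form of Bergsma's estimator; whether the report generator implements it in exactly this way is an assumption, and the function name `cramers_v_corrected` is hypothetical.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))  # missing cells count as 0

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return float("nan")  # undefined when a variable has a single category
    return math.sqrt(phi2_corr / denom)

# Two label columns that determine each other completely.
gender = ["M"] * 50 + ["F"] * 50
segment = ["High_value"] * 50 + ["Low_value"] * 50
print(cramers_v_corrected(gender, segment))
```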

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completeness by eye.

Sample

First rows

|  | CustomerID | Transaction_ID | Transaction_Date | Product_SKU | Product_Description | Product_Category | Quantity | Avg_Price | Delivery_Charges | Coupon_Status | Gender | Location | Tenure_Months | GST | Transaction_Date_Month_x | Transaction_Date_Month_y | User_type | year_number | week_number | month_number | Coupon_Code | Discount_pct | Marketing_Date | Offline_Spend | Online_Spend | revenue | revenue_per_customer | revenue_seg | days_bw_visits | Visit_days_average |
| 0 | 17850 | 16679 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 153.71 | 6.50 | Used | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 144.189 | 60381.669 | High_value | 2.777778 | Less than 30 days visit |
| 1 | 17850 | 16680 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 153.71 | 6.50 | Used | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 144.189 | 60381.669 | High_value | 2.777778 | Less than 30 days visit |
| 2 | 17850 | 16696 | 2019-01-01 | GGOENEBQ078999 | Nest Cam Outdoor Security Camera - USA | Nest-USA | 2 | 122.77 | 6.50 | Not Used | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 258.540 | 60381.669 | High_value | 2.777778 | Less than 30 days visit |
| 3 | 17850 | 16699 | 2019-01-01 | GGOENEBQ079099 | Nest Protect Smoke + CO White Battery Alarm-USA | Nest-USA | 1 | 81.50 | 6.50 | Clicked | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 88.000 | 60381.669 | High_value | 2.777778 | Less than 30 days visit |
| 4 | 17850 | 16700 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 153.71 | 6.50 | Clicked | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 160.210 | 60381.669 | High_value | 2.777778 | Less than 30 days visit |
| 5 | 17850 | 16701 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 153.71 | 6.50 | Clicked | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 160.210 | 60381.669 | High_value | 2.777778 | Less than 30 days visit |
| 6 | 17850 | 16702 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 2 | 153.71 | 6.50 | Clicked | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 320.420 | 60381.669 | High_value | 2.777778 | Less than 30 days visit |
| 7 | 17850 | 16703 | 2019-01-01 | GGOENEBQ079099 | Nest Protect Smoke + CO White Battery Alarm-USA | Nest-USA | 2 | 81.50 | 6.50 | Not Used | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 176.000 | 60381.669 | High_value | 2.777778 | Less than 30 days visit |
| 8 | 17850 | 16704 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 256.88 | 6.50 | Used | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 237.042 | 60381.669 | High_value | 2.777778 | Less than 30 days visit |
| 9 | 17850 | 16710 | 2019-01-01 | GGOENEBJ079499 | Nest Learning Thermostat 3rd Gen-USA - Stainless Steel | Nest-USA | 1 | 153.71 | 28.78 | Clicked | M | Chicago | 12 | 0.1 | 2019-01 | 2019-01 | New | 2019 | 1 | 1 | ELEC10 | 10.0 | 2019-01-01 | 4500 | 2424.5 | 182.490 | 60381.669 | High_value | 2.777778 | Less than 30 days visit |

Last rows

|  | CustomerID | Transaction_ID | Transaction_Date | Product_SKU | Product_Description | Product_Category | Quantity | Avg_Price | Delivery_Charges | Coupon_Status | Gender | Location | Tenure_Months | GST | Transaction_Date_Month_x | Transaction_Date_Month_y | User_type | year_number | week_number | month_number | Coupon_Code | Discount_pct | Marketing_Date | Offline_Spend | Online_Spend | revenue | revenue_per_customer | revenue_seg | days_bw_visits | Visit_days_average |
| 40021 | 16359 | 37354 | 2019-09-04 | GGOEGAAB033814 | Google Men's Vintage Badge Tee Black | Apparel | 1 | 7.60 | 6.5 | Clicked | F | New York | 49 | 0.18 | 2019-09 | 2019-08 | Existing | 2019 | 36 | 9 | SALE30 | 30.0 | 2019-09-04 | 2500 | 1255.01 | 14.10 | 89.152 | Low_value | 13.0 | Less than 30 days visit |
| 40022 | 16359 | 36274 | 2019-08-22 | GGOEGDHQ015399 | 26 oz Double Wall Insulated Bottle | Drinkware | 1 | 19.99 | 6.5 | Clicked | F | New York | 49 | 0.18 | 2019-08 | 2019-08 | New | 2019 | 34 | 8 | EXTRA20 | 20.0 | 2019-08-22 | 2500 | 1172.96 | 26.49 | 89.152 | Low_value | 13.0 | Less than 30 days visit |
| 40023 | 16359 | 36273 | 2019-08-22 | GGOEGHPJ080310 | Google Blackout Cap | Headgear | 1 | 13.29 | 6.0 | Not Used | F | New York | 49 | 0.05 | 2019-08 | 2019-08 | New | 2019 | 34 | 8 | HGEAR20 | 20.0 | 2019-08-22 | 2500 | 1172.96 | 19.29 | 89.152 | Low_value | 13.0 | Less than 30 days visit |
| 40024 | 15171 | 36604 | 2019-08-25 | GGOEGAAJ059116 | Google Men's Short Sleeve Performance Badge Tee Pewter | Apparel | 1 | 8.80 | 6.0 | Used | F | New Jersey | 48 | 0.18 | 2019-08 | 2019-08 | New | 2019 | 34 | 8 | SALE20 | 20.0 | 2019-08-25 | 2500 | 1941.38 | 11.84 | 146.860 | Low_value | 28.0 | Less than 30 days visit |
| 40025 | 15171 | 36604 | 2019-08-25 | GGOEGALB036514 | Google Women's Scoop Neck Tee Black | Apparel | 2 | 4.80 | 6.0 | Clicked | F | New Jersey | 48 | 0.18 | 2019-08 | 2019-08 | New | 2019 | 34 | 8 | SALE20 | 20.0 | 2019-08-25 | 2500 | 1941.38 | 21.60 | 146.860 | Low_value | 28.0 | Less than 30 days visit |
| 40026 | 15171 | 36604 | 2019-08-25 | GGOEGALJ034415 | Google Women's Vintage Hero Tee Platinum | Apparel | 1 | 4.56 | 6.0 | Clicked | F | New Jersey | 48 | 0.18 | 2019-08 | 2019-08 | New | 2019 | 34 | 8 | SALE20 | 20.0 | 2019-08-25 | 2500 | 1941.38 | 10.56 | 146.860 | Low_value | 28.0 | Less than 30 days visit |
| 40027 | 15171 | 36604 | 2019-08-25 | GGOEGALP034315 | Google Women's Vintage Hero Tee Lavender | Apparel | 1 | 7.60 | 6.0 | Not Used | F | New Jersey | 48 | 0.18 | 2019-08 | 2019-08 | New | 2019 | 34 | 8 | SALE20 | 20.0 | 2019-08-25 | 2500 | 1941.38 | 13.60 | 146.860 | Low_value | 28.0 | Less than 30 days visit |
| 40028 | 15171 | 36604 | 2019-08-25 | GGOEGALQ036614 | Google Women's Scoop Neck Tee White | Apparel | 2 | 4.80 | 6.0 | Used | F | New Jersey | 48 | 0.18 | 2019-08 | 2019-08 | New | 2019 | 34 | 8 | SALE20 | 20.0 | 2019-08-25 | 2500 | 1941.38 | 17.28 | 146.860 | Low_value | 28.0 | Less than 30 days visit |
| 40029 | 15171 | 38754 | 2019-09-22 | GGOEGAAJ073413 | Google Men's Short Sleeve Hero Tee Heather | Apparel | 1 | 15.19 | 6.0 | Clicked | F | New Jersey | 48 | 0.18 | 2019-09 | 2019-08 | Existing | 2019 | 38 | 9 | SALE30 | 30.0 | 2019-09-22 | 2500 | 1895.73 | 21.19 | 146.860 | Low_value | 28.0 | Less than 30 days visit |
| 40030 | 15171 | 38755 | 2019-09-22 | GGOEGAFB035815 | Google Men's Zip Hoodie | Apparel | 1 | 44.79 | 6.0 | Clicked | F | New Jersey | 48 | 0.18 | 2019-09 | 2019-08 | Existing | 2019 | 38 | 9 | SALE30 | 30.0 | 2019-09-22 | 2500 | 1895.73 | 50.79 | 146.860 | Low_value | 28.0 | Less than 30 days visit |